Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods

نویسندگان

  • Huw A. Ogilvie
  • Joseph Heled
  • Dong Xie
  • Alexei J. Drummond
چکیده

Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Performance evaluation of block-based copy- move image forgery detection algorithms

Copy-move forgery is a particular type of distortion where a part or portions of one image is/are copied to other parts of the same image. This type of manipulation is done to hide a particular part of the image or to copy one or more objects into the same image. There are several methods for detecting copy-move forgery, including block-based and key point-based methods. In this paper, a method...

متن کامل

Feature Extraction and Efficiency Comparison Using Dimension Reduction Methods in Sentiment Analysis Context

Nowadays, users can share their ideas and opinions with widespread access to the Internet and especially social networks. On the other hand, the analysis of people's feelings and ideas can play a significant role in the decision making of organizations and producers. Hence, sentiment analysis or opinion mining is an important field in natural language processing. One of the most common ways to ...

متن کامل

Predicting Gestational Diabetes Using an Intelligent Neural Network Algorithm

Introduction: Due to the large amount of data on people with diabetes, it is very difficult to extract the predictors of diabetes. Data mining science has achieved this important goal with the help of its effective methods with the aim of discovering the prediction of diseases and has helped physicians and medical staff in predicting and diagnosing diseases.    Methods: The present research is...

متن کامل

Naive binning improves phylogenomic analyses

MOTIVATION Species tree estimation in the presence of incomplete lineage sorting (ILS) is a major challenge for phylogenomic analysis. Although many methods have been developed for this problem, little is understood about the relative performance of these methods when estimated gene trees are poorly estimated, owing to inadequate phylogenetic signal. RESULTS We explored the performance of som...

متن کامل

Verification and Validation of Common Derivative Terms Approximation in Meshfree Numerical Scheme

In order to improve the approximation of spatial derivatives without meshes, a set of meshfree numerical schemes for derivative terms is developed, which is compatible with the coordinates of Cartesian, cylindrical, and spherical. Based on the comparisons between numerical and theoretical solutions, errors and convergences are assessed by a posteriori method, which shows that the approximations...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 65  شماره 

صفحات  -

تاریخ انتشار 2016